1 CS/ECE/ISyE 524 Introduction to Optimization (Spring): NLP algorithms
- Overview
- Local methods
- Constrained optimization
- Global methods
- Black-box methods
- Course wrap-up
Laurent Lessard
2 Review of algorithms
Studying Linear Programs, we talked about:
- Simplex method: traverse the surface of the feasible polyhedron looking for the vertex with minimum cost. Only applicable to linear programs. Used by solvers such as Clp and CPLEX; hybrid versions are used by Gurobi and Mosek.
- Interior point methods: traverse the inside of the feasible polyhedron and move toward the boundary point with minimum cost. Applicable to many different types of optimization problems. Used by SCS, ECOS, and Ipopt.
3 Review of algorithms
Studying Mixed Integer Programs, we talked about:
- Cutting plane methods: solve a sequence of LP relaxations and keep adding cuts (special extra linear constraints) until the solution is integral, and therefore optimal. Also applicable to more general convex problems.
- Branch and bound methods: solve a sequence of LP relaxations (upper bounding), and branch on fractional variables (lower bounding). Store subproblems in a tree and prune branches that aren't fruitful. Most optimization problems can be solved this way: you just need a way to branch (split the feasible set) and a way to bound (efficiently relax).
- Variants of the methods above are used by all MIP solvers.
4 Overview of NLP algorithms
To solve Nonlinear Programs with continuous variables, there is a wide variety of available algorithms. We'll assume the problem has the standard form:

    minimize_x   f_0(x)
    subject to:  f_i(x) ≤ 0   for i = 1, ..., m

What works best depends on the kind of problem you're solving, so we need to talk about problem categories.
5 Overview of NLP algorithms
1. Are the functions differentiable? Can we efficiently compute gradients or second derivatives of the f_i?
2. What problem size are we dealing with? A few variables and constraints? Hundreds? Thousands? Millions?
3. Do we want to find local optima, or do we need the global optimum (more difficult!)?
4. Does the objective function have a large number of local minima, or a relatively small number?
Note: items 3 and 4 don't matter if the problem is convex. In that case, any local minimum is also a global minimum!
6 Survey of NLP algorithms
- Local methods using derivative information. This is what most NLP solvers use (and what most JuMP solvers use).
  - unconstrained case
  - constrained case
- Global methods
- Derivative-free methods
7 Local methods using derivatives
Let's start with the unconstrained case:

    minimize_x  f(x)

Many methods available! Roughly ordered from cheap and slow to expensive and fast:

    stochastic gradient descent → gradient descent → accelerated methods → conjugate gradient → quasi-Newton methods → Newton's method
8 Iterative methods
Local methods iteratively step through the space looking for a point where ∇f(x) = 0:
1. pick a starting point x_0
2. choose a direction to move in, Δ_k. This is the part where different algorithms do different things.
3. update your location: x_{k+1} = x_k + Δ_k
4. repeat until you're happy with the function value or the algorithm has ceased to make progress.
9 Vector calculus
Suppose f : Rⁿ → R is a twice-differentiable function.
- The gradient of f is a function ∇f : Rⁿ → Rⁿ defined by [∇f]_i = ∂f/∂x_i. The vector ∇f(x) points in the direction of greatest increase of f at x.
- The Hessian of f is a function ∇²f : Rⁿ → R^{n×n} defined by [∇²f]_{ij} = ∂²f/(∂x_i ∂x_j). The matrix ∇²f(x) encodes the curvature of f at x.
10 Vector calculus
Example: suppose f(x, y) = x² + 3xy + 5y² - 7x + 2. Then:

    ∇f = [∂f/∂x; ∂f/∂y] = [2x + 3y - 7; 3x + 10y]

    ∇²f = [∂²f/∂x²  ∂²f/∂x∂y; ∂²f/∂x∂y  ∂²f/∂y²] = [2  3; 3  10]

Taylor's theorem in n dimensions:

    f(x) ≈ f(x_0) + ∇f(x_0)ᵀ(x - x_0)                                (best linear approximation)
    f(x) ≈ f(x_0) + ∇f(x_0)ᵀ(x - x_0) + ½(x - x_0)ᵀ∇²f(x_0)(x - x_0)  (best quadratic approximation)
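As a quick sanity check on these formulas (a sketch assuming the ForwardDiff automatic-differentiation package is available; it is not part of the slides):

    using ForwardDiff

    # f(x, y) = x² + 3xy + 5y² - 7x + 2, written with a vector argument v = [x, y]
    f(v) = v[1]^2 + 3v[1]*v[2] + 5v[2]^2 - 7v[1] + 2

    ForwardDiff.gradient(f, [1.0, 2.0])   # [2(1) + 3(2) - 7, 3(1) + 10(2)] = [1.0, 23.0]
    ForwardDiff.hessian(f, [1.0, 2.0])    # [2.0 3.0; 3.0 10.0], constant for a quadratic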
11 Gradient descent
The simplest of all iterative methods. It's a first-order method, which means it only uses gradient information:

    x_{k+1} = x_k - t_k ∇f(x_k)

- The direction -∇f(x_k) is the direction of local steepest decrease of the function. We will move in this direction.
- t_k is the stepsize. There are many ways to choose it:
  - pick a constant: t_k = t
  - pick a slowly decreasing stepsize, such as t_k = 1/√k
  - exact line search: t_k = argmin_t f(x_k - t ∇f(x_k))
  - a heuristic method (most common in practice), for example backtracking line search
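To make the recipe concrete, here is a minimal Julia sketch of gradient descent with backtracking line search. The function names and the Armijo parameters a and b are illustrative choices, not the course's code:

    using LinearAlgebra

    # Backtracking line search: shrink t until the sufficient-decrease
    # (Armijo) condition f(x - t*g) <= f(x) - a*t*||g||^2 holds.
    function backtracking(f, x, g; t = 1.0, a = 0.3, b = 0.5)
        while f(x - t*g) > f(x) - a*t*dot(g, g)
            t *= b
        end
        return t
    end

    function gradient_descent(f, ∇f, x0; tol = 1e-6, maxiter = 10_000)
        x = copy(x0)
        for k in 1:maxiter
            g = ∇f(x)
            norm(g) < tol && break              # close enough to a stationary point
            x -= backtracking(f, x, g) * g      # step along -∇f(x)
        end
        return x
    end

    # Example: the quadratic from slide 10 (minimum at (70/11, -21/11) ≈ (6.36, -1.91))
    f(v)  = v[1]^2 + 3v[1]*v[2] + 5v[2]^2 - 7v[1] + 2
    ∇f(v) = [2v[1] + 3v[2] - 7, 3v[1] + 10v[2]]
    gradient_descent(f, ∇f, [0.0, 0.0])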
12 Gradient descent
We can gain insight into the effectiveness of a method by seeing how it performs on a quadratic: f(x) = ½xᵀQx. The condition number κ := λ_max(Q)/λ_min(Q) determines convergence.

[Figure: distance to the optimal point vs. number of iterations for the optimal stepsize, a shorter step, and an even shorter step, shown for a well-conditioned problem (κ = 10) and for a larger condition number; convergence degrades as κ grows.]
13 Gradient descent
Advantages:
- Simple to implement and cheap to execute.
- Can be easily adjusted.
- Robust in the presence of noise and uncertainty.
Disadvantages:
- Convergence is slow.
- Sensitive to conditioning: even rescaling a variable can have a substantial effect on performance!
- Not always easy to tune the stepsize.
Note: the idea of preconditioning (rescaling) before solving adds another layer of possible customizations and tradeoffs.
14 Other first-order methods
Accelerated methods (momentum methods)
- Still first-order methods, but they make use of past iterates to accelerate convergence. Example: the heavy-ball method (a sketch follows this slide):

    x_{k+1} = x_k - α_k ∇f(x_k) + β_k (x_k - x_{k-1})

  Other examples: Nesterov, Beck & Teboulle, and others.
- Can achieve substantial improvement over gradient descent with only a moderate increase in computational cost.
- Not as robust to noise as gradient descent, and can be more difficult to tune because there are more parameters.
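A minimal heavy-ball sketch, with constant α and β (the values here are illustrative; in practice they are tuned to the problem's curvature):

    # Heavy-ball iteration: x_{k+1} = x_k - α*∇f(x_k) + β*(x_k - x_prev)
    function heavy_ball(∇f, x0; α = 0.05, β = 0.8, iters = 1000)
        x, xprev = copy(x0), copy(x0)
        for k in 1:iters
            x, xprev = x - α*∇f(x) + β*(x - xprev), x   # RHS uses the old x
        end
        return x
    end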
15 Other first-order methods
Mini-batch stochastic gradient descent (SGD)
- Useful if f(x) = Σ_{i=1}^N f_i(x). Use the direction -Σ_{i∈S} ∇f_i(x_k), where S ⊆ {1, ..., N}. The size of S is the batch size: |S| = 1 is SGD and |S| = N is ordinary gradient descent. (A sketch follows this slide.)
- Same pros and cons as gradient descent, but allows a further tradeoff of speed vs. computation.
- Industry standard for big-data problems like deep learning.
Nonlinear conjugate gradient
- A variant of the standard conjugate gradient algorithm for solving Ax = b, adapted for use in general optimization.
- Requires more computation than accelerated methods.
- Converges exactly in a finite number of steps when applied to quadratic functions.
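A sketch of the mini-batch update, assuming the finite sum is available as a vector of per-term gradient functions (grads[i](x) returning ∇f_i(x) is an assumed interface):

    # Mini-batch SGD for f(x) = Σᵢ fᵢ(x).
    # batch = 1 is plain SGD; batch = length(grads) is ordinary gradient descent.
    function minibatch_sgd(grads, x0; batch = 10, t = 0.01, iters = 1000)
        x = copy(x0)
        for k in 1:iters
            S = rand(1:length(grads), batch)        # sample a batch (with replacement)
            x -= t * sum(grads[i](x) for i in S)
        end
        return x
    end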
16 Newton's method
Basic idea: approximate the function as a quadratic, move directly to the minimum of that quadratic, and repeat.
- If we're at x_k, then by Taylor's theorem:

    f(x) ≈ f(x_k) + ∇f(x_k)ᵀ(x - x_k) + ½(x - x_k)ᵀ∇²f(x_k)(x - x_k)

- If ∇²f(x_k) ≻ 0, the minimum of the quadratic occurs at:

    x_{k+1} := x_opt = x_k - ∇²f(x_k)⁻¹ ∇f(x_k)

- Newton's method is a second-order method; it requires computing the Hessian (second derivatives).
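The corresponding iteration in Julia, as a sketch (the linear system is solved rather than the inverse formed):

    using LinearAlgebra

    # Newton's method: assumes ∇²f(x) stays positive definite along the way.
    function newton(∇f, ∇²f, x0; tol = 1e-10, maxiter = 50)
        x = copy(x0)
        for k in 1:maxiter
            g = ∇f(x)
            norm(g) < tol && break
            x -= ∇²f(x) \ g    # Newton step: solve ∇²f(x)*d = ∇f(x)
        end
        return x
    end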
17 Newton's method in 1D
Example: f(x) = log(e^{x+3} + e^{-2x+2}).

[Figure: the graph of f with the first Newton iterates (x_0, f_0), (x_1, f_1), (x_2, f_2); from this starting point, the iterates converge rapidly to the minimum. Example by L. El Ghaoui, UC Berkeley, EE127a.]
18 Newton's method in 1D
Same example, f(x) = log(e^{x+3} + e^{-2x+2}), but starting from a point farther away: divergent!

[Figure: the iterates (x_0, f_0), (x_1, f_1), x_2 overshoot and move away from the minimum. Example by L. El Ghaoui, UC Berkeley, EE127a.]
19 Newton's method
Advantages:
- It's usually very fast. It converges to the exact optimum in one iteration if the objective is quadratic.
- It's scale-invariant: the convergence rate is not affected by any linear scaling or transformation of the variables.
Disadvantages:
- If n is large, storing the Hessian (an n × n matrix) and computing ∇²f(x_k)⁻¹ ∇f(x_k) can be prohibitively expensive.
- If ∇²f(x_k) ⊁ 0, Newton's method may converge to a local maximum or a saddle point.
- It may fail to converge at all if we start too far from the optimal point.
20 Quasi-Newton methods
- An approximate Newton's method that doesn't require computing the Hessian.
- Uses an approximation H_k ≈ ∇²f(x_k)⁻¹ that can be updated directly and is faster to compute than the full Hessian:

    x_{k+1} = x_k - H_k ∇f(x_k)
    H_{k+1} = g(H_k, ∇f(x_k), x_k)

- Several popular update schemes for H_k:
  - DFP (Davidon-Fletcher-Powell)
  - BFGS (Broyden-Fletcher-Goldfarb-Shanno)
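For concreteness, the standard BFGS update of the inverse-Hessian approximation looks like this (the formula is standard; the surrounding code is a sketch):

    using LinearAlgebra

    # One BFGS update of H ≈ ∇²f(x)⁻¹, given the step s = x⁺ - x and the
    # gradient change y = ∇f(x⁺) - ∇f(x). Requires dot(y, s) > 0 (curvature condition).
    function bfgs_update(H, s, y)
        ρ = 1 / dot(y, s)
        V = I - ρ * (s * y')
        return V * H * V' + ρ * (s * s')
    end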
21 Example
- f(x, y) = e^{-(x-3)/2} + e^{(x+4y)/10} + e^{(x-4y)/10}
- The function is smooth, with a single minimum near (4.03, 0).

[Figure: iterate trajectories in the (x, y) plane for gradient descent, Nesterov's method, BFGS, and Newton's method.]
22 Example
[Figure: distance to the optimal point vs. number of iterations for gradient descent, Nesterov's method, BFGS, and Newton's method.]
- Illustrates the complexity vs. performance tradeoff.
- Nesterov's method doesn't always converge uniformly.
- Julia code: IterativeMethods.ipynb
23 Recap of local methods
Important: for any of the local methods we've seen, if ∇f(x_k) = 0, then x_{k+1} = x_k and we won't move!
Roughly ordered from cheap and slow to expensive and fast:

    stochastic gradient descent → gradient descent → accelerated methods → conjugate gradient → quasi-Newton methods → Newton's method
24 Constrained local optimization
The algorithms we've seen so far are designed for unconstrained optimization. How do we deal with constraints?
- We'll revisit interior point methods, and we'll also talk about a class of algorithms called active set methods.
- These are among the most popular methods for smooth constrained optimization.
25 Interior point methods

    minimize_x   f_0(x)
    subject to:  f_i(x) ≤ 0

Basic idea: augment the objective function using a barrier that goes to infinity as we approach a constraint:

    minimize_x   f_0(x) - μ Σ_{i=1}^m log(-f_i(x))

Then alternate between (1) an iteration of an unconstrained method (usually Newton's) and (2) shrinking μ toward zero.
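A schematic of the outer loop. Here newton_minimize is a hypothetical placeholder for any unconstrained inner solver, and shrinking μ by a factor of 10 per round is an illustrative choice:

    # Barrier method sketch: minimize f0(x) - μ Σ log(-fi(x)) for shrinking μ,
    # warm-starting each inner solve at the previous solution.
    function barrier_method(f0, fs, x0; μ = 1.0, shrink = 0.1, rounds = 5)
        x = copy(x0)
        for r in 1:rounds
            φ = z -> f0(z) - μ * sum(log(-fi(z)) for fi in fs)
            x = newton_minimize(φ, x)   # hypothetical unconstrained solver
            μ *= shrink
        end
        return x
    end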
26 Interior point methods
Example: f_0(x) = ½x with -2 ≤ x ≤ 2.

[Figure: the barrier-augmented objective plotted for μ = 0.5, μ = 0.2, and a smaller value of μ; as μ shrinks, the minimizer moves toward the constrained optimum at the boundary x = -2.]
27 Active set methods

    minimize_x   f_0(x)
    subject to:  f_i(x) ≤ 0

Basic idea: at optimality, some of the constraints will be active (equal to zero); the others can be ignored.
- Given some active set, we can solve or approximate the solution of the simultaneous equalities (constraints not in the active set are ignored). Approximations typically use linear (LP) or quadratic (QP) functions.
- Inequality constraints are then added to or removed from the active set based on certain rules, and the process repeats.
- The simplex method is an example of an active set method.
28 NLP solvers in JuMP
- Ipopt (Interior Point OPTimizer) uses an interior point method to handle constraints. If second-derivative information is available, it uses a sparse Newton iteration; otherwise it uses BFGS or SR1 (another quasi-Newton method). (A JuMP usage sketch follows this slide.)
- Knitro (Nonlinear Interior point Trust Region Optimization) implements four different algorithms. Two are interior point (one is algebraic; the other uses conjugate gradient as the inner solver). The other two are active set (one uses sequential LP approximations; the other uses sequential QP approximations).
- NLopt is an open-source platform that interfaces with many (currently 43) different solvers. Only a handful are currently available in JuMP, but some are global/derivative-free.
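As a usage sketch, the slide-21 example (with the objective as reconstructed there) can be handed to Ipopt through JuMP. This uses the @NLobjective macro from the course era; newer JuMP versions also accept a plain @objective:

    using JuMP, Ipopt

    model = Model(Ipopt.Optimizer)
    @variable(model, x, start = 0.0)
    @variable(model, y, start = 0.0)
    @NLobjective(model, Min,
        exp(-(x - 3)/2) + exp((x + 4*y)/10) + exp((x - 4*y)/10))
    optimize!(model)
    value(x), value(y)    # should be ≈ (4.03, 0.0)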
29 NLopt solvers
Algorithms:
LD_AUGLAG, LD_AUGLAG_EQ, LD_CCSAQ, LD_LBFGS_NOCEDAL, LD_LBFGS, LD_MMA, LD_SLSQP, LD_TNEWTON, LD_TNEWTON_RESTART, LD_TNEWTON_PRECOND, LD_TNEWTON_PRECOND_RESTART, LD_VAR1, LD_VAR2,
LN_AUGLAG, LN_AUGLAG_EQ, LN_BOBYQA, LN_COBYLA, LN_NEWUOA, LN_NEWUOA_BOUND, LN_NELDERMEAD, LN_PRAXIS, LN_SBPLX,
GD_MLSL, GD_MLSL_LDS, GD_STOGO, GD_STOGO_RAND,
GN_CRS2_LM, GN_DIRECT, GN_DIRECT_L, GN_DIRECT_L_RAND, GN_DIRECT_NOSCAL, GN_DIRECT_L_NOSCAL, GN_DIRECT_L_RAND_NOSCAL, GN_ESCH, GN_ISRES, GN_MLSL, GN_MLSL_LDS, GN_ORIG_DIRECT, GN_ORIG_DIRECT_L
- L/G: local/global method
- D/N: derivative-based/derivative-free
- mostly implemented in C++; some work with Julia/JuMP
30 Global methods
A global method makes an effort to find a global optimum rather than just a local one.
- If gradients are available, the standard (and obvious) thing to do is multistart (also known as random restarts); a sketch follows this slide:
  - Randomly pepper the space with initial points.
  - Run your favorite local method starting from each point (these runs can be executed in parallel).
  - Compare the different local minima found.
- The number of restarts required depends on the size of the space and how many local minima it contains.
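A bare-bones multistart sketch, where local_min stands in for any of the local methods above (both names are illustrative):

    # Multistart: run a local method from random points in [-box, box]^n
    # and keep the best local minimum found. The trials are independent,
    # so this loop parallelizes trivially.
    function multistart(f, local_min, n; trials = 100, box = 10.0)
        best_x, best_f = nothing, Inf
        for trial in 1:trials
            x0 = box .* (2 .* rand(n) .- 1)    # uniform start in the box
            x  = local_min(f, x0)
            fx = f(x)
            if fx < best_f
                best_x, best_f = x, fx
            end
        end
        return best_x, best_f
    end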
31 Global methods
A global method makes an effort to find a global optimum rather than just a local one.
- A more sophisticated approach: systematically partition the space using a branch-and-bound technique, and search the smaller spaces using local gradient-based search.
- Knowledge of derivatives is required for both the bounding and local optimization steps.
32 Black-box methods
What if no derivative information is available and all we can do is compute f(x)? We must resort to black-box methods (also known as derivative-free or direct search methods). If f is smooth:
- Approximate the derivative numerically by using finite differences, and then use a standard gradient-based method (a sketch follows this slide).
- Use coordinate descent: pick one coordinate, perform a line search, then pick the next coordinate, and keep cycling.
- Stochastic Approximation (SA), Random Search (RS), and others: pick a random direction, perform a line search, repeat.
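A forward-difference gradient estimate costs one extra function evaluation per coordinate. A sketch (h = √eps() is a common default that balances truncation against roundoff error):

    # Forward-difference approximation of ∇f(x): g[i] = (f(x + h*eᵢ) - f(x)) / h.
    function fd_gradient(f, x; h = sqrt(eps()))
        g  = similar(x)
        fx = f(x)
        for i in eachindex(x)
            xh = copy(x)
            xh[i] += h
            g[i] = (f(xh) - fx) / h
        end
        return g
    end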
33 Black-box methods
What if no derivative information is available and f is not smooth? (You're usually in trouble.)
- Pattern search: search over a grid and refine the grid adaptively in areas where larger variations are observed.
- Genetic algorithms: a randomized approach that simulates a population of candidate points and uses a combination of mutation and crossover at each iteration to generate new candidate points. The idea is to mimic natural selection.
- Simulated annealing: a randomized approach using gradient descent that is perturbed in proportion to a temperature parameter; the simulation continues as the system is progressively cooled. The idea is to mimic physics / crystallization.
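As an illustration, here is a derivative-free, Metropolis-style variant of simulated annealing (a sketch; the slide describes a perturbed gradient-descent version, but the accept/reject rule with a cooling temperature is the same idea):

    # Simulated annealing sketch: always accept downhill moves; accept uphill
    # moves with probability exp(-Δf/T); cool the temperature geometrically.
    function simulated_annealing(f, x0; T = 1.0, cool = 0.99, σ = 0.1, iters = 10_000)
        x, fx = copy(x0), f(x0)
        for k in 1:iters
            y  = x .+ σ .* randn(length(x))    # random candidate move
            fy = f(y)
            if fy < fx || rand() < exp(-(fy - fx)/T)
                x, fx = y, fy
            end
            T *= cool
        end
        return x, fx
    end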
34 Optimization at UW-Madison
- Linear programming and related topics
  - CS 525: linear programming methods
  - CS 526: advanced linear programming
- Convex optimization and iterative algorithms
  - CS 726: nonlinear optimization I
  - CS 727: nonlinear optimization II
  - CS 727: convex analysis
- MIP and combinatorial optimization
  - CS 425: introduction to combinatorial optimization
  - CS 577: introduction to algorithms
  - CS 720: integer programming
  - CS 728: integer optimization
35 External resources
Continuous optimization
- Lieven Vandenberghe (UCLA): vandenbe/
- Stephen Boyd (Stanford): boyd/
- Ryan Tibshirani (CMU): ryantibs/convexopt/
- L. El Ghaoui (Berkeley): elghaoui/
Discrete optimization
- Dimitris Bertsimas (MIT): integer programming
- AM121 (Harvard): intro to optimization